Reference-Free Population Genomics from Next-Generation Transcriptome Data and the Vertebrate–Invertebrate Gap
Abstract
In animals, the population genomic literature is dominated by two taxa, namely mammals and drosophilids, in which fully sequenced, well-annotated genomes have been available for years. Data from other metazoan phyla are scarce, probably because the vast majority of living species still lack a closely related reference genome. Here we achieve de novo, reference-free population genomic analysis from wild samples in five non-model animal species, based on next-generation sequencing transcriptome data. We introduce a pipeline for cDNA assembly, read mapping, SNP/genotype calling, and data cleaning, with a specific focus on the detection of hidden paralogy. In two species for which a reference genome is available, similar results were obtained whether the reference was used or not, demonstrating the robustness of our de novo inferences. The population genomic profiles of a hare, a turtle, an oyster, a tunicate, and a termite were found to be intermediate between those of human and Drosophila, indicating that the discordant genomic diversity patterns that have been reported between these two species do not reflect a generalized vertebrate versus invertebrate gap. The genomic average diversity was generally higher in invertebrates than in vertebrates (with the notable exception of the termite), in agreement with the notion that population size tends to be larger in the former than in the latter. The non-synonymous to synonymous ratio, however, did not differ significantly between vertebrates and invertebrates, even though it was negatively correlated with genetic diversity within each of the two groups. This study opens promising perspectives regarding genome-wide population analyses of non-model organisms and the influence of population size on non-synonymous versus synonymous diversity.
Similar resources
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
SCARF: maximizing next-generation EST assemblies for evolutionary and population genomic analyses
SUMMARY Scaffolded and Corrected Assembly of Roche 454 (SCARF) is a next-generation sequence assembly tool for evolutionary genomics that is designed especially for assembling 454 EST sequences against high-quality reference sequences from related species. The program was created to knit together 454 contigs that do not assemble during traditional de novo assembly, using a reference sequence li...
Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics.
Next-generation sequencing has opened the door to genomic analysis of nonmodel organisms. Technologies generating long-sequence reads (200-400 bp) are increasingly used in evolutionary studies of nonmodel organisms, but the short-sequence reads (30-50 bp) that can be produced at lower cost are thought to be of limited utility for de novo sequencing applications. Here, we tested this assumption ...
EvoPipes.net: Bioinformatic Tools for Ecological and Evolutionary Genomics
Recent increases in the production of genomic data are yielding new opportunities and challenges for biologists. Among the chief problems posed by next-generation sequencing are assembly and analyses of these large data sets. Here we present an online server, http://EvoPipes.net, that provides access to a wide range of tools for bioinformatic analyses of genomic data oriented for ecological and...
Generation Gap and its Effective Socio-cultural Factors Involved among the Students of Andimeshk I.A.U
This research studies the generation gap among the students at Andimeshk Islamic Azad University. More than 305 people were chosen using Cochran's formula in the study. The dependent variable was divided into four dimensions: social, cultural, political, and economic. The obtained mean of the study shows there is a meaningful difference between students and their parents (fathers and mothers). It also s...